Online Retail Dataset

In this article, we demonstrate how to calculate RFM (Recency, Frequency, Monetary) scores, a method for analyzing customer value, on the Online Retail dataset from the UC Irvine Machine Learning Repository.

Data Set Information:

This is a transnational data set that contains all the transactions occurring between 01/12/2010 and 09/12/2011 for a UK-based and registered non-store online retailer. The company mainly sells unique all-occasion gifts. Many of its customers are wholesalers.

Attribute Information:

Dropping NaN Values
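
As a minimal sketch of this step, the snippet below counts missing values and drops the rows that lack a customer identifier. The toy DataFrame and the column names (InvoiceNo, Description, Quantity, UnitPrice, CustomerID) are assumptions for illustration, not values taken from the article. Restricting the drop to CustomerID is also an assumption; calling dropna() without arguments would instead remove every row that has any missing value.

```python
import numpy as np
import pandas as pd

# Toy stand-in for the Online Retail data; column names are assumed.
df = pd.DataFrame({
    "InvoiceNo": ["536365", "536366", "536367", "536368"],
    "Description": ["WHITE LANTERN", None, "HAND WARMER", "CREAM CUPID"],
    "Quantity": [6, 2, 3, 8],
    "UnitPrice": [2.55, 1.85, 4.25, 2.75],
    "CustomerID": [17850.0, 13047.0, np.nan, 12583.0],
})

# Count missing values per column before deciding what to drop.
missing_per_column = df.isna().sum()

# Keep only rows with a known customer, since RFM is computed per customer.
df_clean = df.dropna(subset=["CustomerID"]).reset_index(drop=True)

print(missing_per_column)
print(df_clean)
```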

Removing duplicated entries:
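
One way to do this, sketched on a small made-up DataFrame (the rows and column names are assumptions), is to count fully identical rows with duplicated() and then remove them with drop_duplicates().

```python
import pandas as pd

# Toy data with one exact duplicate row; column names are assumed.
df = pd.DataFrame({
    "InvoiceNo": ["536365", "536365", "536366"],
    "StockCode": ["85123A", "85123A", "22633"],
    "Quantity": [6, 6, 2],
    "CustomerID": [17850, 17850, 13047],
})

# duplicated() marks every repeat of an earlier, fully identical row.
n_duplicates = int(df.duplicated().sum())

# drop_duplicates() keeps the first occurrence of each row by default.
df_unique = df.drop_duplicates().reset_index(drop=True)

print(f"Removed {n_duplicates} duplicated row(s); {len(df_unique)} remain.")
```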

Recency, Frequency, Monetary (RFM) Scoring

Next, we create some new features.

Feature          Description
Total_Spending   Total amount of spending
Recency          Days since the last purchase
Frequency        Number of transactions over a defined period
Monetary         Total spending over a defined period
RFM Group        Group assigned to a customer based on their recency, frequency, and monetary values
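
The features in the table could be derived roughly as follows. This is a sketch under several assumptions: the toy data and column names are hypothetical, Total_Spending is taken to be Quantity times UnitPrice per invoice line, Frequency counts distinct invoices, and Recency is measured from a reference date one day after the last transaction in the data.

```python
import pandas as pd

# Toy transactions; column names are assumed.
df = pd.DataFrame({
    "InvoiceNo": ["A1", "A1", "A2", "B1", "C1", "C2"],
    "InvoiceDate": pd.to_datetime([
        "2011-12-01", "2011-12-01", "2011-12-08",
        "2011-11-09", "2011-10-10", "2011-12-09",
    ]),
    "Quantity": [2, 1, 4, 10, 3, 1],
    "UnitPrice": [2.5, 5.0, 1.0, 3.0, 2.0, 4.0],
    "CustomerID": [1, 1, 1, 2, 3, 3],
})

# Spending per invoice line (assumed definition of Total_Spending).
df["Total_Spending"] = df["Quantity"] * df["UnitPrice"]

# Reference date for Recency: one day after the last transaction (assumption).
snapshot = df["InvoiceDate"].max() + pd.Timedelta(days=1)

# One row per customer with the three RFM features.
rfm = df.groupby("CustomerID").agg(
    Recency=("InvoiceDate", lambda d: (snapshot - d.max()).days),
    Frequency=("InvoiceNo", "nunique"),
    Monetary=("Total_Spending", "sum"),
)

print(rfm)
```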

Customers are divided into several groups based on their recency, frequency, and monetary values.

The distribution of each feature can be visualized using the pandas hist function:
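
A minimal sketch of such a plot, using a small made-up RFM table (the values, bin count, and figure size are assumptions). The Agg backend is selected only so the snippet also runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; drop this line in a notebook
import matplotlib.pyplot as plt
import pandas as pd

# Toy per-customer RFM table; values are illustrative only.
rfm = pd.DataFrame({
    "Recency": [2, 31, 1, 120, 45, 7],
    "Frequency": [2, 1, 2, 1, 5, 9],
    "Monetary": [14.0, 30.0, 10.0, 250.0, 75.5, 410.0],
})

# DataFrame.hist draws one histogram per numeric column
# and returns the axes as an array shaped like the layout.
axes = rfm.hist(bins=5, figsize=(9, 3), layout=(1, 3))
plt.tight_layout()
```

In an interactive session, plt.show() would display the three panels.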

Calculating RFM Scores
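
A common convention, shown here as a sketch rather than as the article's exact method, is to score each feature from 1 to 4 by quartile and then combine the three scores. The toy data, the choice of four bins, and the column names R, F, M, RFM_Group, and RFM_Score are assumptions. Recency is scored in reverse, since a more recent purchase should earn a higher score.

```python
import pandas as pd

# Toy per-customer RFM table; values are illustrative only.
rfm = pd.DataFrame(
    {
        "Recency": [2, 31, 1, 120, 45, 7, 200, 15],
        "Frequency": [2, 1, 3, 1, 5, 9, 1, 4],
        "Monetary": [14.0, 30.0, 10.0, 250.0, 75.5, 410.0, 5.0, 120.0],
    },
    index=pd.Index(range(1, 9), name="CustomerID"),
)

# rank(method="first") breaks ties so that qcut always gets unique bin edges.
# Low recency is good, so its labels run from 4 down to 1.
rfm["R"] = pd.qcut(rfm["Recency"].rank(method="first"), 4, labels=[4, 3, 2, 1]).astype(int)
rfm["F"] = pd.qcut(rfm["Frequency"].rank(method="first"), 4, labels=[1, 2, 3, 4]).astype(int)
rfm["M"] = pd.qcut(rfm["Monetary"].rank(method="first"), 4, labels=[1, 2, 3, 4]).astype(int)

# RFM group: the three scores concatenated; RFM score: their sum.
rfm["RFM_Group"] = rfm["R"].astype(str) + rfm["F"].astype(str) + rfm["M"].astype(str)
rfm["RFM_Score"] = rfm[["R", "F", "M"]].sum(axis=1)

print(rfm.sort_values("RFM_Score", ascending=False))
```

With this scheme, a customer in group 444 bought recently, often, and spent the most, while group 111 marks the opposite end.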

